iht method
Structured Sparse Regression via Greedy Hard-thresholding
Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups. For very large datasets and under standard sparsity constraints, hard thresholding methods have proven to be extremely efficient, but such methods require NP hard projections when dealing with overlapping groups. In this paper, we show that such NP-hard projections can not only be avoided by appealing to submodular optimization, but such methods come with strong theoretical guarantees even in the presence of poorly conditioned data (i.e. say when two features have correlation 0.99), which existing analyses cannot handle. These methods exhibit an interesting computation-accuracy trade-off and can be extended to significantly harder problems such as sparse overlapping groups. Experiments on both real and synthetic data validate our claims and demonstrate that the proposed methods are orders of magnitude faster than other greedy and convex relaxation techniques for learning with group-structured sparsity.
Structured Sparse Regression via Greedy Hard Thresholding
Jain, Prateek, Rao, Nikhil, Dhillon, Inderjit S.
Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups. For very large datasets and under standard sparsity constraints, hard thresholding methods have proven to be extremely efficient, but such methods require NP hard projections when dealing with overlapping groups. In this paper, we show that such NP-hard projections can not only be avoided by appealing to submodular optimization, but such methods come with strong theoretical guarantees even in the presence of poorly conditioned data (i.e. say when two features have correlation $\geq 0.99$), which existing analyses cannot handle. These methods exhibit an interesting computation-accuracy trade-off and can be extended to significantly harder problems such as sparse overlapping groups. Experiments on both real and synthetic data validate our claims and demonstrate that the proposed methods are orders of magnitude faster than other greedy and convex relaxation techniques for learning with group-structured sparsity.
Structured Sparse Regression via Greedy Hard-Thresholding
Jain, Prateek, Rao, Nikhil, Dhillon, Inderjit
High dimensional problems where the regressor belongs to a small number of groups play a critical role in many machine learning and signal processing applications, such as computational biology [9] and multitask learning [14]. In most of these cases, the groups overlap, i.e., the same feature can belong to multiple groups. For example, gene pathways overlap in computational biology applications, and parent-child pairs of wavelet transform coefficients overlap in signal processing applications. The existing state-of-the-art methods for solving such group sparsity structured regression problems can be categorized into two broad classes: a) convex relaxation based methods, b) iterative hard thresholding (IHT) or greedy methods. In practice, IHT methods tend to be significantly more scalable than the (group-)lasso style methods that solve a convex program. But, these methods require a certain projection operator which in general is NPhard to compute and often certain simple heuristics are used with relatively weak theoretical guarantees. Moreover, existing guarantees for both classes of methods require relatively restrictive assumptions on the data, like Restricted Isometry Property or variants thereof [2, 8, 18, 14], that are unlikely to hold in most common applications. In fact, even under such settings, the group sparsity based convex programs offer at most polylogarithmic gains over standard sparsity based methods [18].
Iterative Hard Thresholding Methods for $l_0$ Regularized Convex Cone Programming
In this paper we consider $l_0$ regularized convex cone programming problems. In particular, we first propose an iterative hard thresholding (IHT) method and its variant for solving $l_0$ regularized box constrained convex programming. We show that the sequence generated by these methods converges to a local minimizer. Also, we establish the iteration complexity of the IHT method for finding an $\epsilon$-local-optimal solution. We then propose a method for solving $l_0$ regularized convex cone programming by applying the IHT method to its quadratic penalty relaxation and establish its iteration complexity for finding an $\epsilon$-approximate local minimizer. Finally, we propose a variant of this method in which the associated penalty parameter is dynamically updated, and show that every accumulation point is a local minimizer of the problem.